Journal of Forestry Research, 13(4): 299-308 (2002) 




Experimental genomics: The application of DNA microarrays in 
cellular and molecular biology studies 

LUO Xiao-yan 1, TANG Wei 2

(1 Department of Cell and Developmental Biology, University of North Carolina, Chapel Hill, NC 27599, USA)

(2 Forest Biotechnology Group, North Carolina State University, Centennial Campus, P.O. Box 7247, Raleigh, NC 27695, USA)


Abstract Genome sequence information in combination with DNA microarrays promises to revolutionize
cellular and molecular biological research by allowing complex mixtures of RNA and DNA to be
interrogated in a parallel and quantitative fashion. DNA microarrays can be used to measure levels of
gene expression for tens of thousands of genes simultaneously and to take advantage of all available
sequence information for experimental design and data interpretation in pursuit of biological
understanding. Recent progress in experimental genomics allows DNA microarrays not simply to provide a
catalogue of all the genes and information about their function, but to help explain how the components
work together to comprise functioning cells and organisms. This brief review surveys DNA microarray
technology and its applications in genome and gene function analysis, gene expression studies,
biological signaling and defense systems, cell cycle regulation, mechanisms of transcriptional
regulation, proteomics, and the functionality of food components.

Introduction

DNA microarray technology is a new and powerful
technology that will substantially increase the speed of
cellular and molecular biological research. It was
developed at Stanford (Schena et al. 1995) using a small set
of Arabidopsis (Arabidopsis thaliana) ESTs and has since
been applied to several model organisms, including colon
bacillus (Escherichia coli) (Richmond et al. 1999), yeast
(Saccharomyces cerevisiae) (Chu et al. 1998), fruit fly
(Drosophila melanogaster) (White et al. 1999), nematode
(Caenorhabditis elegans) (The C. elegans Sequencing
Consortium 1998), mouse (Tanaka et al. 2000) and human
(Schena et al. 1996). The major advance of DNA microarray
technology over conventional techniques results from the
small size of the array, which allows higher sensitivity,
enables the parallel screening of larger numbers of genes
and provides the opportunity to use smaller amounts of
starting material.
The introduction of fluorescent probes has made
miniaturization of arrays possible (Jordan et al. 1998). The
scale of gene expression analysis is extended not only by
the simultaneous analysis of large numbers of genes, but
also because microarrays can be produced in series,
facilitating comparative analysis of a large number of
samples. Analysis of gene expression is important in many
fields of biological research, since changes in the
physiology of an organism or a cell will be accompanied by
changes in the pattern of gene expression. Gene expression
analysis can be used to obtain insight into the
physiological consequences of genetic modification in
animals and plants.
Several techniques are available for the analysis of gene
expression at the mRNA level, such as Northern blotting
(Alwine et al. 1997), dot blot analysis (Lennon et al.
1991), differential display (Liang and Pardee 1992), and
serial analysis of gene expression (SAGE) (Velculescu et
al. 1995). These methods have disadvantages that render
them unsuitable when large numbers of expression products
have to be analyzed simultaneously. Northern blot analysis
allows only limited numbers of mRNAs to be studied at the
same time. Dot blot analysis requires a relatively large
amount of material because of the size of the filters.
Differential display does enable the simultaneous detection
of multiple differences in gene expression, but screening
is based on differences in mRNA length and not identity.
SAGE involves complex sample preparation procedures,
requires extensive DNA sequencing and is not very
sensitive. Recently, substantial improvement in the
sensitivity and throughput of expression screening has been
obtained by the introduction of DNA microarray technology
(Watson et al. 1998; Duggan et al. 1999; Graves 1999).

The study of gene expression by DNA microarray technology
is based on hybridization of mRNA to a high-density array
of immobilized target sequences, each corresponding to a
specific gene. Sample mRNAs are labeled as a complex
mixture by incorporation of a fluorescent nucleotide during
oligo(dT)-primed reverse transcription. The labeled pool of
sample mRNAs is subsequently hybridized to the array, where
each messenger will quantitatively hybridize to its
complementary target sequence. The fluorescence at each
spot on the array is a quantitative measure corresponding
to the expression level of the particular gene. The use of
two differently labeled mRNA samples allows quantitative
comparison of gene expression between the two samples
(Fig. 1) (van Hal et al. 2000).


[Figure 1 flow diagram: mRNA I from Sample I, labelled by
Cy3-dCTP, gives labelled cDNA I; mRNA II from Sample II,
labelled by Cy5-dCTP, gives labelled cDNA II. Both are
hybridized to the microarray of target sequences such as
PCR fragments, cDNA clones, or oligonucleotides.
Fluorescence is then measured by laser scanner, non-laser
scanner, or CCD camera.]


Fig. 1. The principle of gene expression analysis by DNA
microarray technology

mRNA samples are reverse transcribed to cDNA while fluorescently
labeled nucleotides are incorporated (Cy3- and Cy5-labeled dCTP or
dUTP are often used for this purpose). The use of multiple dyes
allows the comparison of multiple RNA samples on a single array.
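
The two-dye comparison in Fig. 1 is commonly summarized as a
log ratio of the two fluorescence channels for each spot. The
following is a minimal sketch under stated assumptions, not a
procedure given in this review: the intensity values, the gene
names, the background level and the floor value are all
hypothetical, and Cy3 is assumed to label the control sample
and Cy5 the test sample.

```python
import math


def log2_ratios(cy5, cy3, background=0.0, floor=1.0):
    """Return log2(Cy5/Cy3) per spot after background subtraction.

    `floor` (an assumed safeguard) keeps corrected intensities
    positive so that the logarithm is always defined.
    """
    ratios = {}
    for gene in cy5:
        test = max(cy5[gene] - background, floor)
        ref = max(cy3[gene] - background, floor)
        ratios[gene] = math.log2(test / ref)
    return ratios


# Hypothetical spot intensities in arbitrary fluorescence units.
cy3_control = {"geneA": 1000.0, "geneB": 800.0, "geneC": 400.0}
cy5_test = {"geneA": 4000.0, "geneB": 800.0, "geneC": 100.0}

ratios = log2_ratios(cy5_test, cy3_control)
for gene, value in ratios.items():
    print(f"{gene}: log2(Cy5/Cy3) = {value:+.2f}")
```

A positive value indicates higher signal in the Cy5-labeled
sample and a negative value higher signal in the Cy3-labeled
sample; the logarithm makes a four-fold increase and a four-fold
decrease symmetric around zero.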

Currently, two complementary techniques are available.
One uses fragment-based DNA microarrays and the other uses
oligonucleotide-based Affymetrix chips. Fragment-based DNA
microarrays allow the simultaneous hybridization of two
fluorescently labeled probes to an array of immobilized DNA
fragments such as PCR-amplified DNA sequences, each
corresponding to a specific gene. After the microarray is
scanned with a laser scanner, the signal for each DNA
fragment reflects the abundance of the corresponding
messenger RNA in the sample. The use of two differently
labeled samples allows the quantitative comparison of gene
expression between a control and a test experiment.
Alternatively,
Affymetrix chips consist of an array of oligonucleotides of
usually 20-25 bp, which have been synthesized in situ on
a glass surface using photolithography. Each gene to be
analyzed is typically represented by twenty specific probes
on the chip (Fig. 1). Different methods for labeling RNA
are available and allow a quantitative measurement of
transcript abundance. As opposed to fragment-based
microarrays, oligonucleotide arrays require prior knowledge
of DNA sequence information but permit single base change
analysis. DNA microarray technology could be useful in: (1)
genome and gene function analysis; (2) risk assessment of
transgenic agricultural products by analysis of altered gene
expression; (3) studies of biological signaling and defense
systems; (4) unraveling gene function and metabolic
pathways in animals and plants by mutant analysis; (5) cell
cycle and transcriptional regulation research; (6)
protein-protein interaction; (7) investigation of functional
and toxicological effects of food components. Although DNA
microarray technology can be widely used for detection and
identification of complex samples in most areas of current
biology, this paper surveys DNA microarray technology and
its use in the important subjects of cellular and molecular
biology involved in experimental genomics.

Manufacturing of DNA microarrays 

At present, two main techniques are being developed for
manufacturing DNA microarrays (Fig. 2). The first approach,
the DNA chip, encompasses direct synthesis of
oligonucleotides on a solid surface based on
photolithography, as developed by Fodor et al. (1991).
Specific areas of a glass surface that is derivatised with
linker molecules carrying a photo-labile protective group
are selectively illuminated through a photo-mask.
Subsequently, the surface is incubated with a solution
containing a photo-protected nucleotide, which will only be
coupled to the light-activated areas. After removal of the
excess nucleotide, a second photo-mask is used to de-protect
other areas on the surface and subsequently another type of
nucleotide is coupled to these areas. By repeating this
procedure, a defined set of oligonucleotides is synthesized
on the surface (Fig. 2) (Chee et al. 1996; McGall et al.
1996; Lipshutz et al. 1995). The method allows the
manufacturing of microarrays with very high densities of
250 000 oligonucleotide spots per cm² and facilitates the
production of large series of identical arrays. However, it
is prohibitively expensive and has no flexibility in design.
The second approach, the DNA micro-dispenser, is more
flexible and can be performed in a regular molecular biology
laboratory (Schena et al. 1995). Small quantities of DNA
solution, with a minimum volume of approximately 50 pl, are
dispensed onto a solid surface. The number of commercially
available micro-dispensing robots is quickly increasing and
the performance of these machines is continually improving.
DNA micro-dispensers apply the DNA solution with a pin
that touches the solid surface (Fig. 2). The density of the
spots depends on the capabilities of the dispensing device.
DNA dispensing is flexible and allows for constant updating
of the array. Oligonucleotides as well as longer known or
unknown DNA sequences can be deposited on the array in a
format that can vary, if necessary, from one array to the
next. Using a micro-dispenser it should also be possible to
synthesize oligonucleotides directly on an array or to
deposit molecules other than nucleic acids, for example
proteins.
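
The mask-by-mask logic of the photolithographic approach can
be illustrated with a small simulation. This is a simplified
sketch and not the manufacturing process itself: the three
target sequences, the fixed A, C, G, T coupling order and the
function name are assumptions made for illustration, and the
chemistry (linkers, de-protection yield) is ignored. Each
"mask" is simply the set of spot positions that would be
illuminated before a given nucleotide is coupled.

```python
def synthesize(targets, order="ACGT"):
    """Simulate light-directed synthesis of one oligonucleotide per spot.

    In each step, the mask exposes exactly those spots whose next
    required base equals the nucleotide about to be coupled.
    Returns the grown sequences and the list of (nucleotide, mask).
    """
    grown = ["" for _ in targets]
    masks = []
    while any(len(g) < len(t) for g, t in zip(grown, targets)):
        for base in order:
            mask = {
                i
                for i, (g, t) in enumerate(zip(grown, targets))
                if len(g) < len(t) and t[len(g)] == base
            }
            if mask:  # skip coupling steps that would expose no spot
                masks.append((base, mask))
                for i in mask:
                    grown[i] += base
    return grown, masks


# Hypothetical target oligonucleotides for three spots.
targets = ["ACGT", "AAGT", "CCGA"]
grown, masks = synthesize(targets)

for step, (base, mask) in enumerate(masks, start=1):
    print(f"step {step}: couple {base} at spots {sorted(mask)}")
print(grown)
```

With a fixed coupling order, an array of oligonucleotides of
length n needs at most 4n masks regardless of the number of
spots, which is consistent with the method being suited to
large series of identical, high-density arrays but inflexible
once the mask set has been made.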

DNA molecules that carry a 5' modification can be
covalently bound to a glass surface that carries reactive
groups (Schena et al. 1996; Rogers et al. 1999). At present,
surface-modified glass slides (silylated or poly-L-lysine
coated) are most commonly used as a substrate. Besides
glass, other materials are being explored as well, such as
gold-coated slides, polyacrylamide gel pads, and
nitrocellulose or nylon membranes (Proudnikov et al. 1998).
These alternatives aim at better signal-to-noise ratios and
improved reproducibility. DNA microarray technology is
rapidly evolving in many aspects. To analyze gene expression
in animal and plant biopsies, micro-dissected specimens,
or small plant tissues, the overall sensitivity has to be
increased. This will depend on technical aspects such as the
quality of the scanners and arrayers, fluorescent dyes with
improved quantum yields and lower background, or target
supports with reduced background and more target sequence
binding capacity (Wittung et al. 1996; Geiger et al. 1998).
The elucidation of metabolic pathways or identification of
novel responding genes necessitates the use of large arrays,
often with undefined target sequences. Compared to
conventional methods, DNA microarrays have the advantage
that expression of large sets of genes can be determined in
parallel (Spellman et al. 1998; Iyer et al. 1999). If the
number of genes of an organism is not too large and all the
genes are known, sequences corresponding to all open reading
frames can be spotted on the array, which allows
simultaneous experimental analysis of all mRNAs.



Fig. 2 Two approaches for DNA microarray manufacturing.

A: Direct synthesis of oligonucleotides on a solid surface based on
photolithography; B: DNA micro-dispensers (van Hal et al. 2000).

Hybridization, scanning, and data analysis 

After mRNA has been purified from the biological sample,
it should be labeled fluorescently by incorporation of
a modified nucleotide during cDNA synthesis prior to
hybridization (Fig. 1). Criteria for selecting a fluorophore
are a narrow excitation and emission peak, a high level of
photon emission, which results in better sensitivity, and
resistance to photo-bleaching. At present, the fluorescent
cyanine dyes Cy3 and Cy5 are most often used. Cy5 has the
disadvantage that it sometimes gives high background
fluorescence on glass surfaces and is more sensitive to
photo-bleaching than Cy3. The development of fluorescent
dyes with improved characteristics and compatible scanners
will certainly facilitate a further increase in the
signal-to-noise ratio. The ability to use multiple dyes in a
single experiment will allow better comparison of several
different mRNA samples simultaneously. The required
hybridization conditions, such as sample concentration,
ionic strength and temperature, largely depend on the size
of the DNA fragments present on the array and must be
determined for a given experimental set-up. Although
existing sequence databases are of great value for selecting
relevant genes, the identification of unknown genes dictates
the use of cDNA libraries as source. The correlation between
transcript concentration and hybridization signal can be
determined by spiking control RNA transcripts into a
background of a total messenger population. Work of Lemieux
et al. (1998) has indicated that the RNA expression level
and hybridization signal show a linear correlation, provided
that the target DNA is present on the array in at least a
ten-fold excess.
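
The spike-in calibration described above amounts to fitting a
straight line through known control amounts and their measured
signals, and then reading unknown amounts off that line. The
sketch below uses ordinary least squares in plain Python; the
spike amounts, the signal values and their units are
hypothetical and serve only to illustrate the calculation.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept.

    Also returns the squared correlation coefficient r^2.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical spike-in series: amount of control transcript added
# (arbitrary units) and the fluorescence signal measured for it.
spike_amount = [1.0, 2.0, 5.0, 10.0, 20.0]
signal = [210.0, 405.0, 1010.0, 1990.0, 4020.0]

slope, intercept, r2 = fit_line(spike_amount, signal)


def estimate_amount(measured_signal):
    """Invert the calibration line to estimate a transcript amount."""
    return (measured_signal - intercept) / slope


print(f"slope={slope:.1f}, intercept={intercept:.1f}, r^2={r2:.4f}")
print(f"signal 1000 -> estimated amount {estimate_amount(1000.0):.2f}")
```

Such a calibration is only meaningful within the range covered
by the spiked controls and, following the condition stated in
the text, when the target DNA on the array is in excess.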

For imaging, several types of microarray readers are
available. They can be subdivided into CCD cameras,
non-confocal laser scanners and confocal laser scanners.
CCD cameras allow fast scanning but their imaging area is
rather small. Confocal laser scanners have the advantage
that their light collection efficiency and resolution are
usually much higher than those of the other two systems. In
addition, confocal laser scanners have a small depth of
focus, which reduces artifacts but also requires more
scanning precision. In a typical procedure, a hybridized
microarray is scanned repeatedly with a confocal laser
scanner such as the ScanArray 3000 (General Scanning) at
543 nm (Cy3, green HeNe laser) or 632 nm (Cy5, red HeNe
laser), with the laser power at 75%. To obtain a reliable
quantitative result, the microarray spot diameter should be
at least five to ten times the pixel size, which is at
present about 10 µm for confocal laser scanners.
Hybridization may be improved by developing new approaches,
such as the use of electric fields to enhance the
hybridization rate and precisely regulate the stringency of
the hybridization (Edman et al. 1997; Sosnowski et al.
1997).
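
The rule relating spot diameter to pixel size can be worked
through numerically. The sketch below assumes a pixel size of
about 10 µm, which is the value the text appears to intend for
confocal laser scanners, and approximates each spot as a
circle; both function names are illustrative.

```python
import math


def min_spot_diameter_um(pixel_um, factor):
    """Smallest spot diameter for a given pixel size and safety factor."""
    return pixel_um * factor


def pixels_per_spot(diameter_um, pixel_um):
    """Approximate number of pixels covering a circular spot."""
    radius_px = diameter_um / (2.0 * pixel_um)
    return math.pi * radius_px ** 2


PIXEL_UM = 10.0  # assumed scanner pixel size in micrometres

for factor in (5, 10):
    diameter = min_spot_diameter_um(PIXEL_UM, factor)
    n_pixels = pixels_per_spot(diameter, PIXEL_UM)
    print(f"factor {factor}: diameter {diameter:.0f} um, "
          f"about {n_pixels:.0f} pixels per spot")
```

Under these assumptions a spot needs to be roughly 50 to
100 µm across, so that its intensity is averaged over about 20
to 80 pixels rather than estimated from only a few.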

Complex hybridization data are generated within a short
time, and sophisticated software is needed to keep a good
overview, to assess the quality of the data and to help find
statistical significance and relevant correlations within
and between different arrays and experiments. Microarray-
derived gene expression data should be linked to the DNA
sequence information and experimental data available in
public databases. For example, one can find genes that
are part of the same metabolic route, compare their
expression behavior under all the tested conditions and
present the data in an orderly and visual way. The software
should be able to trace back a spot on an array to a clone
in the freezer and its sequence in a database. Such
data-handling software packages are commercially available,
but large improvements are still necessary, especially with
regard to spot location, expression pattern recognition,
comparison of larger numbers of different experimental
samples, statistical analysis and visual data presentation.

Applications of DNA microarrays

Genome and gene function analysis 

Functional analysis, through parallel expression
monitoring, should help researchers better understand the
fundamental mechanisms that underlie plant growth and
development. By accumulating databases of expression
information as a function of tissue type, developmental
stage, hormone and herbicide treatment, genetic background
and environmental condition, it should be possible to
identify the genes involved in many aspects of current
biology. Microarray analysis provides a way to link genomic
sequence information and functional analysis. Recent
experiments involving the use of cDNA microarrays for
expression monitoring indicated the immediate applicability
of DNA chips in agricultural biotechnology. DNA microarray
technology may be used to collect, in a more highly parallel
fashion, much of the data that are obtained at present by
Southern and Northern hybridization approaches. Genomic
DNA samples can be manipulated experimentally to select
for particular regions before hybridization to obtain
specific types of information. The applications of DNA
microarrays to genomic studies primarily involve the search
for single-nucleotide polymorphisms, which may have
considerable importance (Cheung et al. 1999). Analytical
techniques such as genetic-linkage mapping or association
analysis (Baldwin et al. 1999) can be used to discover
genetic predispositions to disease and to classify diseases
according to the underlying defect and the best treatment
option. When the locations of nearly all of the specific
defects have been determined, the work will become
considerably easier, because a specific subset of probes can
be constructed for a given purpose.

Pharmacogenomics is another interesting potential
application of DNA microarrays in genomics that has been
suggested recently (Schena et al. 1996; Graves 1999).
Because each individual has a slightly different genetic
makeup, each will have a unique set of polymorphic sites;
although these polymorphisms might not be sufficiently
aberrant to cause disease, some of them would determine how
each individual responds to a particular drug. With the
proper DNA microarrays, one might know ahead of time whether
a patient would have an adverse reaction to a particular
drug or whether the drug would be ineffective for the
disease. Other genomic applications of DNA microarrays
include the identification of criminals or of blood
relatives, tissue typing in organ-donor selection, and
studies on evolution and interspecies similarities (Richmond
and Sommerville 2000). Biologists have rapidly recognized
the importance of DNA microarray technology, as illustrated
by ambitious genomic programs (Richmond et al. 1999; Wang et
al. 2000) and by the establishment of core microarray
facilities. Recently, the first large-scale analysis was
performed to identify nitrate-responsive genes using 5 524
unique cDNA clones representing approximately a quarter of
the Arabidopsis genome (van Hal et al. 2000; Wang et al.
2000). Novel nitrate-induced genes were found and multiple
responses to nitrate were observed at the transcript level.
These results demonstrated the power of such global
investigation for gene discovery and for the analysis of
regulatory networks. With the completion of genomic
sequencing projects in several model organisms, genome-wide
microarrays will become essential tools for discovering the
function of genes.

Gene expression studies 

Measuring transcript levels for thousands of genes in
parallel is one of the more widespread applications of DNA
chip technology. Microarrays for gene expression analysis
were the first biological application of DNA chip
technology. Both oligonucleotide and cDNA microarrays work
well for transcript monitoring (Desprez et al. 1998; Durrant
et al. 2000; Kehoe et al. 1999; Maleck et al. 2000;
Matsumura et al. 1999; Roth et al. 1998; Rushton and
Somssich 1998). One advantage of oligonucleotide microarrays
for expression studies is that chips can be prepared
directly from sequence databases, obviating the need for
cumbersome clone handling and sample tracking. Another
advantage of oligonucleotide microarrays is that transcripts
from individual members of multi-gene families that share
extensive sequence homology can be easily distinguished by
synthesizing oligonucleotides to regions of non-identity.
Microarrays of cDNAs possess some distinct advantages
over oligonucleotide microarrays, including the ease of
prototyping and data analysis, immediate accessibility to
the research community, and the capacity to examine large
numbers of novel cDNAs in gene discovery applications.
Both oligonucleotide and cDNA microarrays will be widely
used for gene expression analysis. Three strategies can be
adopted when developing a DNA chip for gene expression
studies. The first consists of strategically selecting genes
that are known to play an important role in a particular
biological pathway. The second strategy, which is restricted
to cDNA microarrays, is to use clones from a library prior
to sequence analysis (Rushton and Somssich 1998). The
third strategy is to generate a chip with the complete
expressed sequence content of an organism (Thiellement et
al. 1999). Genome chips provide the best chances for
discovering new interactions between metabolic and genetic
pathways and for gaining functional insights into novel
expressed sequences.

The recent advent of tools enabling the global analysis
of gene expression, coupled to the genome sequencing of
model species like Arabidopsis or rice, is dramatically
changing the way experimentation is done and provides the
research community with the hope of answering more
general questions. One technological advance, DNA
microarrays, is already becoming a standard tool for
genome-wide monitoring of gene expression in animal studies
and is starting to contribute to the field of plant biology.
Models established by DNA microarrays serve to organize
current information, relationships and hypotheses, and can
be tremendously helpful for testing new hypotheses,
interpreting new observations, designing experiments, and
predicting the likely effects of particular chemical,
genetic or cellular perturbations. DNA microarrays are
already being used to study how cells respond to
environmental changes and stress through changing mRNA
patterns. It has been suggested that this technology might
be extended to studies of environmental toxicity caused by
dioxin or mercury, by looking for subtle changes in gene
expression, and to the evaluation of the many products
resulting from combinatorial chemistry, which might not
cause obvious changes in cellular appearance or behavior but
could cause subtle metabolic changes that would show up when
the mRNA was interrogated by an array. One day, it might be
possible, based on DNA microarrays, to control the life
cycle of an animal or plant much more precisely or to find
an efficient insecticide that does not affect other species.

Cancerous cells often have a number of unique
characteristics, including loss of heterozygosity, fusion
transcripts, and changes in tumor-suppressor genes
(Cheung et al. 1999; DeRisi et al. 1997) or oncogenes.
Some of these changes might be detected more easily
through expression changes. It is likely that the activity
of certain genes will be up-regulated and that of others
down-regulated in actively dividing malignant cells. Arrays
might eventually be used to predict whether a particular
tumor would respond to a particular drug and to obtain an
early indication of recurrence following remission or a
seemingly effective therapy.

Biological signal and defense system 

Recent progress in understanding biological signaling and
defense systems has highlighted an interacting network of
signaling pathways leading to the induction of numerous
genes. The combination of expression data with other
biochemical or metabolite measurements seems another
promising approach. Induced defense has received a lot of
attention, and over the years a large number of genes
encoding defense-related proteins have been identified. A
vast majority of these genes are induced after plants are
attacked by diverse aggressors such as microbial
pathogens, viruses, or insects. Understanding the signaling
machinery that links the perception of the incoming enemy
to specific changes in gene expression has been the focus
of recent research, and several key molecular components
have been isolated in Arabidopsis thaliana (Glazebrook
1999). The emerging picture is that a complex network of
interdependent signaling pathways conveys information
on the nature of the aggressor and allows the plant to
mount an appropriate defense response (Reymond et al.
1998). Plants have to deal with a vast range of pathogens,
and it is not known what proportion of the genome is
allocated to defense.

Transcript profiles described during systemic acquired
resistance (SAR), a defense reaction known to develop in
systemic leaves after an initial pathogen attack on local
leaves (Maleck et al. 2000), indicated that 4.3% of the
genes (300 out of 7 000) were involved in the SAR
response. The metabolic profiling technique has also
recently permitted the discovery and quantification of fatty
acid-derived molecules that accumulate during wounding
and pathogenesis (Vollenweider et al. 2000). The role of
these molecules as biological regulators implicated in
defense will be tested by microarray analysis, highlighting
a potential application of transcript profiling methods for
discovering the function of new metabolites. When the
repertoires of transcripts and metabolites measured in a
single experiment increase to genome-scale levels, the
challenge will be to integrate these complex databases and
to extract meaningful biological information. In order to
fully understand complex defense responses, input from
proteomic and metabolomic studies will be essential
(Roessner et al. 2000; Trethwey et al. 1999).
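
The proportion quoted for the SAR response can be checked
directly from the two gene counts given in the text. The short
sketch below only reproduces that arithmetic; the function name
is illustrative.

```python
def percent_responsive(n_responsive, n_total):
    """Percentage of arrayed genes that respond to a treatment."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return 100.0 * n_responsive / n_total


# Counts quoted in the text for the SAR response.
share = percent_responsive(300, 7000)
print(f"{share:.1f}% of the genes on the array")
```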

Cell cycle regulation

A large fraction of cell cycle-modulated genes is involved
in DNA synthesis, cell growth or cell division. Although
there is a strong correlation between distinct experimental
profiles and functional assignment, not all genes involved
in DNA replication are expressed periodically in the cell
cycle, and some genes that do not need to be cell
cycle-regulated are transcribed in a periodic fashion
(Tanaka et al. 2000; Velculescu et al. 1995). Studies of
cell cycle regulation have focused on genes with cell
cycle-specific functions. These genes, whose functions are
needed for only a part of the cycle, are directly involved
in DNA replication and mitosis. For some such genes,
transcriptional regulation may be a matter of conserving
resources. Genes needed for cycling are evidently not needed
during the dormant period, but are very much needed
immediately afterward. Thus, cell cycle-regulated expression
may ensure that necessary gene products are always available
to cycling cells. In this regard, it is interesting to note
that the purine-rich motif AAGAAAAA (Spellman et al. 1998)
is thought to be important for the response to glucose; this
motif may be important in the switch from stationary phase
to rapid growth, and we find similar motifs enriched in the
promoters of several types of cell cycle-regulated genes.
That is, these genes may be growth regulated as well as
cell cycle regulated. Other genes with cell cycle-specific
functions act as regulators or switches. For these genes it
is important not only exactly when they are on but also when
they are off. Transcriptional regulation of a gene
controlling a switch can be central to its function
(Velculescu et al. 1995; Graves 1999).

Cell cycle-regulated transcription can be used to build a
structure in a highly controlled way. This can be
illustrated with some parallels between the strategy of the
cell for regulating DNA replication and its strategy for
regulating differentiation. Transcriptional controls provide
key components of the initiation complexes at certain times
so that the complexes can be built in an orderly manner;
however, the complexes cannot easily be rebuilt later at an
inappropriate time, partly because the components are no
longer available. Many cell cycle-regulated genes have
functions that are essentially not cell cycle specific. The
best single example of such a gene may be PMA1, encoding the
major plasma membrane proton pump, a stable protein. The
PMA1 function is essential, and although it is required
throughout the cell cycle (van Hal et al. 2000; Spellman et
al. 1998), its transcription is strongly periodic. Using DNA
microarray analysis, Spellman et al. (1998) found 800 yeast
genes whose transcripts oscillate through one peak per cell
cycle. They defined these 800 genes by using an empirical
model of cell cycle regulation, whose threshold was somewhat
arbitrary. Below this threshold there may well be genes
whose expression is truly periodic and whose periodicity
might even have biological significance. They observed
independently that the expression of these genes was
affected by induction of Cln3p and Clb2p. Although the basis
of the regulation of the remaining genes and some of the
detailed behavior of the cyclin-dependent gene expression
remain to be elucidated, there will be increasing value in
genomic data sets as more of them accumulate, and together
these will fully realize the promise of the genome
sequencing projects (Graves 1999; Cheung et al. 1999).
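
The idea of scoring each gene for periodicity and applying a
threshold can be illustrated with a simple Fourier-type score.
This sketch is not the empirical model used by Spellman et al.
(1998); the sampling times, the assumed cell-cycle period of 60
minutes, the two example profiles and the threshold of 0.25 are
all hypothetical choices made for illustration.

```python
import math


def periodicity_score(times, values, period):
    """Magnitude of the Fourier component of a profile at `period`.

    The profile is mean-centred first, so a constant offset in
    expression does not contribute to the score.
    """
    mean = sum(values) / len(values)
    omega = 2.0 * math.pi / period
    a = sum((v - mean) * math.sin(omega * t) for t, v in zip(times, values))
    b = sum((v - mean) * math.cos(omega * t) for t, v in zip(times, values))
    return math.sqrt(a * a + b * b) / len(values)


# Hypothetical time course: one sample every 10 min over two cycles.
times = [10.0 * i for i in range(12)]
PERIOD = 60.0     # assumed cell-cycle length in minutes
THRESHOLD = 0.25  # arbitrary cut-off, as any such threshold must be

# A profile with one peak per cycle and a rapidly fluctuating one.
periodic = [math.sin(2.0 * math.pi * t / PERIOD) for t in times]
noisy = [0.1 if i % 2 == 0 else -0.1 for i in range(12)]

score_periodic = periodicity_score(times, periodic, PERIOD)
score_noisy = periodicity_score(times, noisy, PERIOD)

for name, score in (("periodic", score_periodic), ("noisy", score_noisy)):
    label = "cell cycle regulated" if score > THRESHOLD else "not called"
    print(f"{name}: score {score:.3f} -> {label}")
```

Because the cut-off is a free parameter, genes scoring just
below it are not called even though some of them may be
genuinely periodic, which is the limitation noted in the text.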

Mechanisms of transcriptional regulation 

When the complete genome sequence is available, gene
expression data can be used to identify new genomic
sequence motifs that are over-represented in the genomic
DNA in the vicinity of similarly behaving genes and sets of
co-regulated genes. The correlation between the presence
of specific sequence motifs in promoter regions and gene
expression patterns may be stronger than the correlation
between functional categories and gene expression
patterns. Spellman et al. (1998) examined 800 genes for
the binding sites of known cell cycle transcription factors
using DNA microarrays. They found that 400 genes were
good matches to known sites relevant to the phase of peak
expression and that 280 of these same genes showed a
significant response to Cln3p or Clb2p induction. In
addition, they identified as cell cycle regulated other sets
of genes that form functional pathways and were known to
be co-regulated or about whose regulation something is
known. The DNA microarray data give us a partial picture
of the logical circuitry of transcriptional controls in the
cell cycle. A large number of genes are induced in G1 and S
by the action of Cln3p-Cdc28p on MBF and SBF. By M, these
genes become repressed by the action of Clb2p-Cdc28p.
At the same time, Clb2p-Cdc28p, acting through
MCM1 + SFF, induces its own constellation of genes.
These genes include the important transcription factor
Swi5p; once its transcription has been induced, it is
allowed to enter the nucleus (van Hal et al. 2000; Rushton
and Somssich 1998). The loss of Clb2p-Cdc28p activity
causes a collapse in the transcription of all
Clb2p-dependent transcripts and allows Cln3p-Cdc28p to
reactivate MBF and SBF to begin a new cell cycle.
However, the oscillation in the genes expressed in M/G1
phase from an MCM1 site (the ECB) and what makes
expression of these genes cell cycle regulated remain to be
explained (Duggan et al. 1999; Rushton and Somssich
1998).
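
The search for motifs that are over-represented near
co-regulated genes, described at the start of this section,
can be illustrated with a minimal Python sketch. It scores
one candidate motif against one cluster of genes with a
hypergeometric test. The promoter sequences, the gene
names and the function name motif_enrichment_p are
hypothetical, and the choice of a hypergeometric test is an
assumption made for illustration rather than the method of
the cited studies. ACGCGT is used as the example motif
because it is the core of the MCB element recognized by
MBF.

```python
from math import comb


def motif_enrichment_p(cluster_genes, all_promoters, motif):
    """Return (k, K, p) for a motif in a gene cluster.

    k: cluster genes whose promoter contains the motif
    K: genes in the whole set whose promoter contains the motif
    p: hypergeometric probability of seeing at least k hits
       when drawing len(cluster_genes) genes at random
    """
    N = len(all_promoters)
    n = len(cluster_genes)
    K = sum(motif in seq for seq in all_promoters.values())
    k = sum(motif in all_promoters[g] for g in cluster_genes)
    # Upper tail: sum over every outcome with k or more hits.
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1))
    return k, K, tail / comb(N, n)


# Hypothetical toy promoters (real promoters are far longer).
all_promoters = {
    "g1": "TTACGCGTAA", "g2": "ACGCGTTTTT", "g3": "GGGACGCGTC",
    "g4": "AACGCGTGGA", "g5": "TTTTTTTTTT", "g6": "GGGGCCCCAA",
    "g7": "ATATATATAT", "g8": "CCCCGGGGTT", "g9": "AAGGTTCCAA",
    "g10": "TGCATGCATG",
}
cluster = ["g1", "g2", "g3"]  # assumed co-expressed genes

k, K, p = motif_enrichment_p(cluster, all_promoters, "ACGCGT")
print(k, K, f"{p:.4f}")  # prints: 3 4 0.0333
```

In practice many candidate motifs and many clusters would be
tested, so the resulting p-values would need correction for
multiple testing before any motif is called enriched.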

In addition, microarrays will help in the identification of
genes whose expression is controlled by known
transcription factors. Transcription factors could be
over-expressed or silenced in transgenic plants and the
effect on gene expression measured by microarray
analysis. This was achieved successfully in yeast for
defining target genes modulated by transcription factors
involved in oxidative stress (Wang et al. 2000). Another
particularly promising way of using microarrays for
understanding the mechanisms of pathogenesis will be to
compare the responses induced by various pests or
microorganisms. For instance, an analysis of transcript
profiles after challenging plants with different pathogens
or stimuli might show whether the host discriminates
between pathogens and might help identify transcript
profiles among host responses. A larger effort will be
necessary before pathogen-specific transcript profiles can
be defined, but this opens the perspective of being able to
precisely diagnose plant diseases at the molecular level,
which will undoubtedly be of central importance for
agriculture. As model genomes are sequenced, it will soon
be possible to have the complete sets of genes of both the
host and the pathogen on the same microarray, producing
a unique molecular view of the interaction between the
plant and its aggressor (DeRisi et al. 1997; Rushton and
Somssich 1998).

Molecular bar coding and reverse genetics 

Another experimental genomic activity in which DNA
chips will play a central role is the characterization of
populations of mutant organisms exposed to various
selective pressures. The completion of the yeast genome
sequencing project has catapulted the yeast genetics
community into the post-genome era. One of the next
logical steps for the yeast community is the systematic
preparation of single-gene deletion mutants corresponding
to all 6 000 open reading frames (ORFs). Indeed, the ease
with which yeast genes can be targeted by homologous
recombination is a central advantage of this model system.
Shoemaker and colleagues (Simpson et al. 2000) have
proposed a new strategy for screening large populations of
knock-out mutants in parallel. Their strategy consists of
introducing unique molecular sequences or 'bar codes' into
each of the 6 000 ORFs in the yeast genome. These unique
20-mers can then be used for parallel hybridization
analysis with oligonucleotide microarrays. In this strategy,
a pool of yeast strains containing individual bar codes for
all 6 000 genes is subjected to a selective pressure.
Samples of cells growing under selective conditions are
taken at incremental times during the course of the
experiment and the bar-code sequences are labeled by
multiplex PCR with fluorescent primers. Each pool of
fluorescent amplicons is then hybridized to an
oligonucleotide microarray containing sequences
complementary to each of the amplified bar codes
(Reymond et al. 1998; Trethewey et al. 1999).

Comparative analysis of fluorescent intensities at each
bar code position over time provides a quantitative
measure of the fitness of each strain under a given
selective pressure. Correlation between strain
disappearance and selective pressure allows global
functional analysis of yeast genes. Because genetic bar
coding is not yet feasible in higher organisms, systematic
searches for mutant alleles will be required for genetic
analysis in these systems. In the case of plants,
Arabidopsis is likely the organism of choice for these
efforts. Feldmann and collaborators have proposed the use
of expressed sequence tags (ESTs) as a global means of
identifying 'insertion elements' in genes of interest (van
Hal et al. 2000; Aharoni et al. 2000). In this approach,
PCR is used to screen pools of Arabidopsis lines bearing
insertion elements at random locations. Lines bearing a
mutational insert in a gene of interest yield a specific
amplicon in the PCR amplification. Although this method
can be performed for all of the sequenced genes of
Arabidopsis, it is both labor intensive and costly in terms
of primer synthesis. Studies have shown that interlaced
asymmetric PCR can be used to generate products of
plant DNA / T-DNA insert junctions (van Hal et al. 2000;
Lockhart et al. 1996). Hybridization of PCR amplicons to
microarrays of expressed sequences could be used to
speed the identification of mutant lines of Arabidopsis.
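
The bar-code fitness measure described earlier in this
section can be sketched in a few lines of Python. The
sketch fits a least-squares line to the log2 signal of each
bar code over time and reports the slope, so that a strain
being lost from the pool gets a negative value. The function
name barcode_fitness_slopes, the strain labels and the
intensity values are hypothetical, and the sketch assumes
that the signals have already been background-corrected and
normalized between arrays, which the text does not specify.

```python
from math import log2


def barcode_fitness_slopes(times, intensities):
    """Slope of log2(signal) versus time for each bar code.

    times: sampling times (any consistent unit)
    intensities: {strain: [signal at each sampling time]}
    A slope near 0 means the strain keeps pace with the pool;
    a negative slope means it is being depleted.
    """
    t_mean = sum(times) / len(times)
    sxx = sum((t - t_mean) ** 2 for t in times)
    slopes = {}
    for strain, signals in intensities.items():
        y = [log2(s) for s in signals]
        y_mean = sum(y) / len(y)
        sxy = sum((t - t_mean) * (yi - y_mean)
                  for t, yi in zip(times, y))
        slopes[strain] = sxy / sxx  # ordinary least squares
    return slopes


# Hypothetical time course: one strain halves at every step.
times = [0, 1, 2, 3]
signals = {
    "neutral": [100, 100, 100, 100],
    "depleted": [100, 50, 25, 12.5],
}
for strain, slope in barcode_fitness_slopes(times, signals).items():
    print(strain, round(slope, 3))
```

With these toy values the depleted strain has a slope of
about -1 log2 unit per time step and the neutral strain a
slope of about 0. Real signals would be noisy, so replicate
hybridizations would be needed before ranking strains.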

DNA microarrays and proteomics 

The elucidation of protein-protein interactions within
cells, as well as the identification of proteins that bind
small ligands, is another area in which DNA chips could
significantly increase the rate of discovery. Bartel and
colleagues (van Hal et al. 2000; Wodicka et al. 1997) have
demonstrated that a 'protein-linkage' map can be created
using genomic sequence information. The authors
correctly suggest that the yeast two-hybrid system is
probably the best tool available for the systematic
determination of protein-protein interactions in complex
organisms. The two-hybrid system uses two fusion proteins
to activate the transcription of reporter genes in yeast
(Eisen et al. 1998; Blackstock et al. 1999). The first fusion
protein contains a DNA binding domain fused to a protein
of interest, while the second contains an acidic
transcriptional activation domain fused to a second protein
of interest. Specific interactions between the two chimeric
proteins lead to transcriptional activation of the reporter
genes, which is easily scored by either color-based assays
or auxotrophic complementation. In the conventional
two-hybrid approach, the identity of interacting proteins is
confirmed by sequence analysis of each clone identified in
the yeast assay.

As an alternative to conventional DNA sequencing, it is
possible to use chip hybridization to identify the genes
involved in protein-protein interactions. In the case where
entire genome sequences are available (e.g., yeast), DNA
chips can be used for massive, parallel gene resequencing.
With the chips, it would be possible to rapidly identify all
of the clones whose encoded sequences interact in the
two-hybrid assay. In this experimental design, PCR would
be used to amplify and label each cDNA insert that
encodes an interacting protein. Hybridization to genome
chips would allow identification of all of the genes involved
in protein-protein interaction in a single hybridization.
Phage display libraries are also amenable to DNA
chip-based detection systems. Phage display utilizes fusion
proteins encoded by chimeric sequences of bacteriophage
coat proteins and genes of interest (Blackstock & Weir
1999). As in the case of the two-hybrid system, cDNA
libraries encoding fusion proteins are created. Phage
display libraries can be made for plant species by fusing
coding sequences to plant virus coat proteins (van Hal et
al. 2000). Individual clones in the library can be selected
by 'panning' with any ligand of interest. If this ligand is
immobilized on an inert support, the bacteriophage clones
that bind can be purified by elution. The bacteriophage
particles can then be used as templates for PCR, with a set
of fluorescently labeled primers that flank the cDNA
inserts. The cDNA fragments in this enriched population
could then be characterized by microarray hybridization.

Functionality of food components

DNA microarray technologies can be used in the safety
assessment of genetic modification of food plants. This
assessment should be based not only on the evaluation of
the newly introduced trait, but also on possible unintended
side effects resulting from the genetic alteration.
Presently, the latter is done by comparison of a limited
number of individual macro- and micronutrients as well as
anti-nutritional factors, including natural toxins, between
the transformed line and its traditional counterpart.
However, this approach has its limitations, since it is
unknown whether the currently investigated components
are the only and most important factors to screen with
respect to food safety. This type of analysis therefore
provides only limited information on the potential effects
on human health. DNA microarray technology provides the
opportunity to screen for unintended changes in
expression of large numbers of genes in an unbiased
manner. Van Hal et al. (2000) and Tavazoie et al. (1999)
tested this approach for tomato. They constructed tomato
cDNA libraries enriched for cDNAs preferentially
expressed in either green or red tomato. Arrays containing
these sequences as well as target sequences of known
genes will be used for the systematic comparison of control
and genetically modified tomatoes. This will provide
information on the natural variation in gene expression of
tomato at certain stages of ripening as well as specific
changes due to genetic alteration (Tavazoie et al. 1999).

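
The comparison of a control line with a genetically modified
line can be pictured with a minimal Python sketch that
computes a log2 expression ratio for every gene on the array
and flags the genes whose ratio exceeds a threshold. The
function name flag_changed_genes, the gene labels, the signal
values and the twofold threshold are all assumptions chosen
for illustration and are not taken from the cited tomato
studies.

```python
from math import log2


def flag_changed_genes(control, modified, min_abs_log2=1.0):
    """Return {gene: log2(modified/control)} for flagged genes.

    control, modified: {gene: normalized array signal}
    min_abs_log2: assumed cutoff; 1.0 is a twofold change
    Only genes measured in both samples are compared.
    """
    flagged = {}
    for gene in control.keys() & modified.keys():
        ratio = log2(modified[gene] / control[gene])
        if abs(ratio) >= min_abs_log2:
            flagged[gene] = ratio
    return flagged


# Hypothetical signals for three genes in ripe fruit.
control = {"geneA": 200, "geneB": 150, "geneC": 80}
modified = {"geneA": 210, "geneB": 600, "geneC": 20}

for gene, ratio in sorted(flag_changed_genes(control, modified).items()):
    print(gene, ratio)
```

Here geneB is flagged as fourfold higher and geneC as
fourfold lower in the modified line, while geneA is left
unflagged. A fixed cutoff ignores the natural variation
between plants and ripening stages mentioned above, so a
real assessment would compare such ratios against replicate
control-versus-control hybridizations.
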
One way to identify relevant biological functions is by
large-scale expression analysis, since a large number of
characters can be tested in parallel and a large number of
different exposures can be compared. We are focusing on
effects of food components on the human intestine
(Bassett et al. 1999). Differentially expressed genes will be
characterized by sequencing and further expression
analysis in a number of different intestinal cell lines as
well as in vivo. DNA-microarray-assisted gene expression
analysis offers a powerful tool to identify the genes that are
affected in a certain mutant. Those genes will be related to
the nature of the mutation and may give an important clue
in elucidating the function of the mutated gene in
transgenic animals or plants (Bassett et al. 1999).

Perspective 

One of the challenges facing the research community
will be to deal with the flow of data generated by whole
genome expression studies (Bassett et al. 1999;
Ermolaeva et al. 1998). The database developed at
Stanford University hosts data from multiple microarray
analyses in different organisms and offers several useful
query tools. It will be important in the future to be able to
compare expression data across many different
experimental, geographical or technical platforms. In the
next few years, DNA microarrays will certainly become a
standard tool in each laboratory. We can foresee two ways
of using the technology with the aim of answering different
types of questions (Fig. 3). First, microarrays containing a
representation of the whole plant genome will serve to
identify the expression pattern of genes of unknown
function, to define specific sets of genes responding to
various stresses or stimuli, to provide a global view of
metabolic processes, and to assist in comparing wild type
and mutant organisms. Because the production and
routine use of whole-genome microarrays might be
financially too demanding for most research groups,
genomic centers providing access to large microarrays
might develop further and allow the screening for genes of
specific interest. Both DNA microarray and Affymetrix chip
technologies are complementary and suited for the
production and analysis of whole genome based arrays
(Reymond et al. 2000).

Second, small-scale custom microarrays will be quite
useful for a deep and thorough analysis of the expression
patterns of hundreds of genes and would be affordable for
most research groups (Tavazoie et al. 1999; Ermolaeva et
al. 1998). Such studies may include more detailed
characterization of expression patterns, including
replication of multiple experiments and time-course
analyses. The strategy of fabricating custom arrays
tailored to a specific biological question has the advantage
of being easier to control at the production side while
reducing the amount of data to process and integrate.
Another source of candidate genes that are involved in
specific plant responses, and that might also make up
boutique microarrays, will come from differential screening
methods such as differential display and RNA
finger-printing (cDNA-AFLP); such gene sets might also
simply be assembled on the basis of literature searches or
in silico analyses of SAGE and EST databases.
Whole-genome microarray analyses will be serviced by
core facilities and will provide users with the possibility of
performing global analyses of gene expression to study
regulatory processes or to discover the function of
unknown genes. Large-scale microarrays will also help in
defining sets of genes that will be chosen to fabricate less
expensive custom microarrays within each laboratory.
These microarrays will be tailored to more specific
research projects and will be more useful for routine
transcript profiling (Fig. 3). Dedicated microarrays
containing a well-defined set of defense-related genes have
already demonstrated their utility for the study of wound-
and insect-inducible gene expression and the involvement
of signal molecules in the wound and pathogen responses
(Reymond et al. 2000).


[Figure 3, flow diagram, levels from top to bottom:
1. Whole genome microarray analysis: spotted PCR
   fragments or Affymetrix chips
2. Gene function and expression studies; biological signal
   and defense; cell cycle regulation, proteomics;
   mechanism of transcriptional regulation; functionality of
   food components
3. Specific sets of genes derived from differential
   screenings
4. Small-scale custom microarrays based on
   problem-specific transcript profiling]

Fig. 3 Schematic overview of the potential use of DNA
microarrays in the future.

Large-scale microarrays will help in defining sets of genes that will be
chosen to fabricate less expensive custom microarrays within each
laboratory. These microarrays will be more useful for routine
transcript profiling.

Conclusions 


DNA microarrays will substantially increase the speed at
which differential gene expression can be analyzed and
gene functions can be elucidated. Genome sequencing
programs have already produced large amounts of
sequence data. Component or pathway engineering is
expected to accelerate research and improve knowledge in
the fields of cellular and molecular biology. To fulfill these
expectations, further improvement of the technology with
respect to reproducibility, speed, cost and sensitivity will
be needed. Consequently, sufficient attention should be
paid to the development of biological model systems, which
will facilitate the further optimization required for the
technology's various applications. Information obtained
from genomics, large-scale expression analysis,
proteomics and metabolite profiling will be invaluable for
identifying gene functions, pathways and interactive
cellular physiology. DNA microarrays containing a full
animal or plant genome might soon be available in more
model systems and will certainly contribute to a precise
knowledge of all events occurring during growth,
development, differentiation, and pathogenesis and will be
crucial for the discovery of gene function. Small-scale
custom arrays with dedicated sets of genes might also
prove to be useful for a deep and thorough analysis of the
biological processes that take place in a cell. It is expected
that DNA microarrays will greatly help in studying these
complex interactions in cells or organisms.